ABSTRACT

A  variety  of  detection  statistics  have  been  developed  and  applied  to  hyperspectral  imagery  (HSI).  The 
Reed  Xiaoli  (RX)  algorithm  is  a  generalized  likelihood  ratio  test  (GLRT)  that  uses  local  estimates  of  the  spectral 
mean  and  spectral  covariance.  It  satisfies  an  optimality  criterion  if,  locally,  the  spectral  data  have  a  multivariate 
normal  probability  distribution.  Alternatively,  the  stochastic  expectation  maximization  (SEM)  algorithm  may  be 
used  to  estimate  the  spectral  mean  values  and  spectral  covariance  matrices  of  a  pre-determined  number  of  classes.  A 
detection  statistic  is  computed  by  identifying  each  pixel  with  the  class  having  maximal  a  posteriori  probability  and 
applying  the  GLRT  detection  statistic  for  that  class.  These  algorithms  are  based  on  different  models  and  provide 
different  information  about  the  imagery.  For  example,  the  RX  algorithm  seeks  to  identify  local  anomalies,  and  the 
SEM  based  detector  attempts  to  discern  those  pixels  that  do  not  belong  to  one  of  the  model  classes.  Thus  we 
evaluate  the  improvement  in  detection  performance  that  results  from  developing  a  joint  RX-SEM  decision  criterion. 
The  joint  decision  boundaries  are  obtained  by  modeling  the  output  distribution  of  each  of  the  algorithms  and 
selecting  a  joint  distribution  that  further  incorporates  the  correlation  between  the  RX  and  SEM  detector  output.  The 
performance of the resulting fusion statistics is compared with the separate performance of the algorithms and
with the AND and OR fusion rules.


1.  INTRODUCTION 


Fusion  problems  arise  in  surveillance  if  a  scene  is  observed  using  multiple  sensors,  if  data  from  a  sensor  is 
processed  using  various  techniques,  or  if  the  scene  is  surveyed  from  diverse  positions  or  at  different  times.  Using 
multiple  sensors  is  advantageous  for  detection  and  classification  problems  if  the  sensors,  e.g.,  broadband  imagers, 
foliage penetration radar, and spectral sensors covering various wavelengths, provide complementary information.
Furthermore,  as  each  sensor  may  have  better  performance  under  certain  conditions  and  as  the  characteristics  of  the 
background,  target,  and  environment  may  be  unknown,  the  best  surveillance  strategy  for  a  given  problem  may  be  to 
fuse  the  outputs  of  several  sensors  and/or  algorithms.  Target  detection  algorithms  are  generally  derived  from 
models  of  the  sensor  data  and  an  optimality  criterion,  such  as  the  Neyman-Pearson  rule,  using  approximations 
necessitated  by  limited  knowledge  of  the  background  and  target.  Algorithms  are  often  adapted  to  uncertain 
conditions by estimating parameters of the underlying model in situ; however, different modeling approaches may
lead  to  fundamentally  different  techniques,  and  none  of  the  methods  may  be  universally  superior.  In  these 
circumstances  algorithm  fusion  may  be  used  to  resolve  the  inconsistencies  of  competing  detection  algorithms. 

In  this  paper  we  develop  transform  methods  to  obtain  joint-decision  contours  for  a  set  of  detection 
statistics,  and  we  apply  this  technique  to  anomaly  detection  in  hyperspectral  imagery  using  the  RX  and  SEM 
algorithms.  The  transforms  are  based  on  probability  distributions  of  the  output  of  the  individual  detection  statistics, 
and  their  correlation.  The  performance  of  the  transform  approach  to  fusion  is  compared  with  AND  and  OR  fusion 
rules  and  with  a  model  selection  approach.  The  performance  of  these  algorithms  is  compared  on  three  hyperspectral 
data  sets. 


2.  RX  AND  SEM  ALGORITHMS 

The RX algorithm [7] is based on the assumption that within a small region around a test pixel (i,j) the
background has a multivariate normal density with mean $\mu$ and spectral covariance matrix $\Gamma$.
The RX detection statistic [7] is derived as a generalized likelihood ratio test:


$$\Lambda(y) = \frac{\max_{\omega \in \Omega_1} p_1(y,\omega)}{\max_{\omega \in \Omega_0} p_0(y,\omega)} \qquad (2)$$


where $p_i(\cdot,\omega)$ is the probability density function of the observation under hypothesis i
($H_0$: target absent; $H_1$: target present) given the parameter $\omega$. In application to RX, $\omega = [b, \Gamma]$,
where b is the signature of the target spectrum and $\Gamma$ is the spectral covariance matrix. For a single pixel target the
RX statistic may be represented as

$$RX(y) = (y-\mu)^T \Gamma^{-1} (y-\mu). \qquad (3)$$
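
As a rough illustration of Eq. 3, the following Python sketch computes the single-pixel RX statistic with NumPy. The function name `rx_statistic`, the use of a flat array of reference pixels to form the local estimates, and the synthetic data are assumptions made for this example and are not taken from [7].

```python
import numpy as np

def rx_statistic(y, background):
    """Single-pixel RX statistic (y - mu)^T Gamma^{-1} (y - mu).

    y          : (n_bands,) spectrum of the test pixel
    background : (n_pixels, n_bands) reference spectra used to estimate
                 the local mean mu and covariance Gamma (an assumption
                 about how the local estimates are formed)
    """
    mu = background.mean(axis=0)
    gamma = np.cov(background, rowvar=False)
    d = y - mu
    # Solve Gamma * v = d instead of forming the explicit inverse.
    return float(d @ np.linalg.solve(gamma, d))

rng = np.random.default_rng(0)
bg = rng.normal(size=(500, 4))           # synthetic 4-band background
typical = rx_statistic(bg[0], bg)        # a background pixel
anomaly = rx_statistic(bg[0] + 6.0, bg)  # same pixel with a large spectral offset
```

A pixel whose spectrum is far from the local mean, relative to the local covariance, receives a large score, which is the sense in which RX flags local anomalies.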

The  RX  statistic  is  defined  more  generally  for  spatially  extended  targets  by  [7] 


a,p 


(4) 


The stochastic expectation maximization (SEM) algorithm [1,4] has been applied to segment
hyperspectral imagery into sets of classes, each of which is described by a multivariate Gaussian pdf. Thus the
hyperspectral image is modeled as having a Gaussian mixture pdf

$$X \sim \sum_{j=1}^{m} p_j N(\mu_j, \Gamma_j) \quad \text{such that } 0 < p_j \text{ and } \sum_{j=1}^{m} p_j = 1. \qquad (5)$$

A detection statistic has been constructed on the basis of SEM segmentation. Pixel (i,j) is assigned to the SEM class
$\psi_{ij} = k$ such that the conditional probability that observation $x_{ij}$ came from class k is maximal, i.e.

$$k = \arg\max_{h}\; p(\psi_{ij} = h \mid x_{ij}, \{p_j, \mu_j, \Gamma_j,\ 1 \le j \le m\}) \qquad (6)$$

The SEM detection statistic operating on $x_{ij}$ is then the RX algorithm for the given class, i.e. it is the class
conditional GLRT

$$SEM(x_{ij}) = (x_{ij} - \mu_k)^T \Gamma_k^{-1} (x_{ij} - \mu_k). \qquad (7)$$
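
The class assignment and class-conditional statistic can be sketched as follows. This is a minimal Python example that assumes the mixture parameters (weights, means, covariances) have already been estimated; the SEM estimation step itself is not shown, and the function name `sem_statistic` and the two-class parameters are hypothetical.

```python
import numpy as np

def sem_statistic(x, weights, means, covs):
    """Return (k, stat): the MAP class index and its quadratic-form statistic."""
    log_post = []
    quad = []
    for p, mu, gamma in zip(weights, means, covs):
        d = x - mu
        q = float(d @ np.linalg.solve(gamma, d))
        _, logdet = np.linalg.slogdet(gamma)
        # Log of p_j * N(x; mu_j, Gamma_j), dropping the (2*pi)^(-n/2)
        # factor that is common to all classes.
        log_post.append(np.log(p) - 0.5 * logdet - 0.5 * q)
        quad.append(q)
    k = int(np.argmax(log_post))
    return k, quad[k]

# Hypothetical two-class mixture in two bands.
weights = [0.7, 0.3]
means = [np.zeros(2), np.array([5.0, 5.0])]
covs = [np.eye(2), 2.0 * np.eye(2)]
k, stat = sem_statistic(np.array([4.5, 5.5]), weights, means, covs)
```

Here the test spectrum lies close to the second class mean, so it is assigned to that class and scored against that class's mean and covariance only.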


Gamma mixture distributions were used to model the output of the RX and SEM algorithms. The gamma
density with shape parameter v and scale parameter a is given by

$$\gamma_{v,a}(x) = \frac{x^{v-1} e^{-x/a}}{a^{v}\,\Gamma(v)}, \quad x > 0. \qquad (8)$$


A gamma mixture density has the form

$$f(x) = \sum_{k=1}^{m} p_k\, \gamma_{v_k,a_k}(x), \quad \text{where } \sum_{k=1}^{m} p_k = 1 \text{ and } p_k > 0. \qquad (9)$$
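
For concreteness, a gamma mixture density of this form can be evaluated with the Python standard library alone. The sketch below assumes the usual shape/scale parameterization of the gamma density; the mixture parameters are hypothetical values chosen only for illustration, not fitted values from the paper.

```python
import math

def gamma_pdf(x, v, a):
    """Gamma density with shape v and scale a, evaluated at x."""
    if x <= 0.0:
        return 0.0
    log_pdf = (v - 1.0) * math.log(x) - x / a - v * math.log(a) - math.lgamma(v)
    return math.exp(log_pdf)

def gamma_mixture_pdf(x, weights, shapes, scales):
    """Mixture density sum_k p_k * gamma_pdf(x; v_k, a_k)."""
    return sum(p * gamma_pdf(x, v, a) for p, v, a in zip(weights, shapes, scales))

# Hypothetical two-term mixture (m = 2).
weights, shapes, scales = [0.9, 0.1], [4.0, 2.0], [0.5, 3.0]

# Trapezoidal check that the mixture integrates to approximately one.
step = 0.001
grid = [i * step for i in range(1, 100001)]   # covers (0, 100]
vals = [gamma_mixture_pdf(x, weights, shapes, scales) for x in grid]
total = step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
```

The second component, with a smaller weight and a larger scale, gives the mixture a heavier tail than a single gamma density, which is the feature that matters when setting thresholds at low false alarm rates.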


At the pixel level the RX, Eq. 3, and SEM, Eq. 7, algorithms apply a quadratic form

$$Q_{(\mu,\Gamma)}(x) = (x-\mu)^T \Gamma^{-1} (x-\mu) \qquad (10)$$


to  each  observation.  Thus,  if  the  background  is  normally  distributed  and  if  parameters  are  known  with  sufficient 
accuracy,  the  output  of  RX  applied  to  background  only  data  would  have  a  chi-squared  distribution  on  n  degrees  of 
freedom  (DOF)  where  n  is  the  number  of  linearly  independent  dimensions  of  the  data  [8].  The  chi-squared 
distribution on n DOF is the gamma distribution with $v = n/2$ and $a = 2$. If the background regions have Gaussian
mixture  distributions  or  if  there  is  significant  parameter  estimation  error,  then  the  RX  output  will  have  a  gamma 
mixture  distribution  [6].  Similarly,  if  the  SEM  algorithm  perfectly  assigned  pixels  to  Gaussian  classes,  the  output  of 
SEM  from  each  class  would  be  a  chi-squared  distribution.  Classification  and  parameter  estimation  errors,  however, 
can  lead  to  SEM  having  a  gamma  mixture  distribution.  The  background  may  not  be  well  fit  by  a  Gaussian  mixture 
distribution, in which case the gamma mixture distribution may not provide the best approach to modeling the
detector  outputs.  The  fusion  method  defined  below  is  not  dependent  on  any  particular  form  of  the  marginal  densities 
and  can  be  implemented  using  non-parametric  descriptions  of  the  pdf.  However,  the  gamma  mixture  distributions, 
with m=2, provide good descriptions of the data utilized in this paper. For example, Figure 1 compares the
empirical  cumulative  distribution  of  the  SEM  algorithm  applied  to  desert  VNIR  data  and  the  estimated  2-state 
gamma  mixture  distribution. 
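
The chi-squared behavior of the quadratic form under ideal conditions can be checked numerically. The sketch below is an illustration on synthetic Gaussian data with known parameters (an assumption; in practice the parameters are estimated). It relies on the standard facts that a chi-squared variable on n DOF has mean n and variance 2n.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bands = 5
mu = np.arange(n_bands, dtype=float)
a = rng.normal(size=(n_bands, n_bands))
gamma = a @ a.T + n_bands * np.eye(n_bands)   # a positive-definite covariance

x = rng.multivariate_normal(mu, gamma, size=20000)
d = x - mu
# Quadratic form (x - mu)^T Gamma^{-1} (x - mu) for every sample,
# using the true mean and covariance.
q = np.einsum("ij,ij->i", d, np.linalg.solve(gamma, d.T).T)

sample_mean = q.mean()   # expected to be close to n_bands
sample_var = q.var()     # expected to be close to 2 * n_bands
```

Departures of the empirical moments from these values on real detector output are one indication that a single chi-squared model is inadequate and that a mixture model may be needed.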


As  indicated  above,  we  anticipate  that  the  distribution  of  the  RX  algorithm  will  depend  in  part  on  whether 
or  not  the  neighborhood  of  a  test  pixel  is  well  modeled  by  a  Gaussian  distribution  or  a  Gaussian  mixture  distribution. 
We  use  the  BHEP  test  [2]  to  evaluate  the  goodness  of  fit  of  the  multivariate  normal  distribution  to  the  local 
background.  The  BHEP  test  statistic  compares  the  empirical  characteristic  function  of  the  data,  transformed  to  zero 
mean  and  identity  covariance,  with  the  characteristic  function  of  the  zero-mean  identity-covariance  normal 

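distribution. Let the sample data to be tested be $\{x_j\}_{j=1}^{n}$ and let $Y_j = H^{-1}(x_j - \mu)$, where
$\mu = E(X)$ and $\Gamma = HH^T = \mathrm{cov}(X)$. The empirical characteristic function of Y is

$$\Psi_n(t) = \frac{1}{n}\sum_{j=1}^{n} \exp(i\, t^T Y_j), \qquad (11)$$

and the BHEP test statistic is

$$D_{n,\beta} = \int \left| \Psi_n(t) - \exp\left(-\frac{\|t\|^2}{2}\right) \right|^2 \varphi_\beta(t)\, dt, \qquad (12)$$

where

$$\varphi_\beta(t) = \exp\left(-\frac{\|t\|^2}{2\beta^2}\right).$$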

The  test  is  consistent,  invariant  under  affine  transformations,  and  applicable  to  any  number  of  samples  and  data 
dimensions.  The  first  three  moments  of  the  limiting  distribution,  as  the  number  of  samples  approaches  infinity,  are 
known  and  can  be  used  to  approximate  thresholds  of  the  test  statistic  corresponding  to  prescribed  probabilities  of 
type  I  error.  The  application  of  this  test  also  forms  the  basis  for  model-selection  based  fusion  described  below. 
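
A BHEP-type statistic can be evaluated without numerical integration when the weight function is a Gaussian density, because the weighted integral then has a closed form. The sketch below assumes the weight is the normalized $N(0, \beta^2 I)$ density, uses sample estimates for the mean and covariance, and takes $\beta = 1$ by default; these choices, and the function name `bhep_statistic`, are assumptions for illustration rather than the settings used in this paper.

```python
import numpy as np

def bhep_statistic(x, beta=1.0):
    """Weighted L2 distance between the empirical characteristic function
    of the standardized data and that of the standard normal.

    x : (n, d) array of samples. Assumes a normalized Gaussian weight.
    """
    n, d = x.shape
    mu = x.mean(axis=0)
    gamma = np.cov(x, rowvar=False, bias=True)
    h = np.linalg.cholesky(gamma)            # gamma = h @ h.T
    y = np.linalg.solve(h, (x - mu).T).T     # zero mean, identity covariance
    sq_norm = np.sum(y * y, axis=1)
    # Pairwise squared distances ||y_j - y_k||^2.
    pair = sq_norm[:, None] + sq_norm[None, :] - 2.0 * (y @ y.T)
    b2 = beta * beta
    term1 = np.exp(-0.5 * b2 * pair).mean()
    term2 = 2.0 * (1.0 + b2) ** (-d / 2.0) * np.exp(-0.5 * b2 * sq_norm / (1.0 + b2)).mean()
    term3 = (1.0 + 2.0 * b2) ** (-d / 2.0)
    return float(term1 - term2 + term3)

rng = np.random.default_rng(2)
gauss = rng.normal(size=(400, 2))
# Two well-separated clusters: clearly not a single Gaussian.
bimodal = rng.normal(size=(400, 2))
bimodal[:, 0] += np.where(rng.random(400) < 0.5, -5.0, 5.0)

d_gauss = bhep_statistic(gauss)
d_bimodal = bhep_statistic(bimodal)
```

The statistic is small for data that are consistent with a single multivariate normal and grows as the data depart from normality, so a neighborhood is declared non-Gaussian when the statistic exceeds a threshold.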

3.  FUSION  RULES 

Model  selection  and  joint-decision  fusion  approaches  are  developed  and  applied  to  RX-SEM  output.  A 
fusion rule partitions the RX-SEM space into regions $R_0$ and $R_1$ such that if $(r(x), s(x)) \in R_0$ then observation x is
declared to be target free, and if $(r(x), s(x)) \in R_1$ then observation x is declared to come from a target. Since RX
and SEM are positive valued, these regions may be defined by a mapping $\delta : \mathbb{R}_+^2 \to \{0,1\}$ such that $R_k = \delta^{-1}(k)$,
where $\mathbb{R}_+ = [0, \infty)$.


The model selection rule is defined as follows. For each observation let $b = b(x_{ij}) = \mathrm{BHEP}(N(i,j))$ be
the BHEP test, Equation 12, applied to the reference data in a neighborhood $N(i,j)$ of pixel (i,j), and let
$T_{BHEP}$, $T_{RX}$, and $T_{SEM}$ be thresholds for BHEP, RX, and SEM respectively. Then

$$\delta_{MS} = \begin{cases} 1, & \text{if } (b < T_{BHEP} \text{ and } r > T_{RX}) \text{ or } (b \ge T_{BHEP} \text{ and } s > T_{SEM}) \\ 0, & \text{otherwise} \end{cases} \qquad (13)$$
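
One reading of the model selection rule, in which RX is used where the local background passes the BHEP test and SEM is used otherwise, can be sketched in Python as follows. The function and argument names are hypothetical, and the threshold values in the usage lines are arbitrary.

```python
def model_selection_rule(b, r, s, t_bhep, t_rx, t_sem):
    """Return 1 (target) or 0 (target free) for one observation.

    b : BHEP statistic of the local neighborhood
    r : RX output, s : SEM output
    t_bhep, t_rx, t_sem : thresholds (illustrative names)
    """
    if b < t_bhep:
        # Local background is consistent with a single Gaussian: use RX.
        return int(r > t_rx)
    # Otherwise fall back on the SEM-based statistic.
    return int(s > t_sem)

# Arbitrary example values.
gaussian_case = model_selection_rule(0.1, 12.0, 1.0, t_bhep=0.5, t_rx=10.0, t_sem=8.0)
mixture_case = model_selection_rule(0.9, 12.0, 1.0, t_bhep=0.5, t_rx=10.0, t_sem=8.0)
```

In the second call the same RX and SEM outputs give a different decision because the neighborhood fails the normality test, so only the SEM output is consulted.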


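The AND and OR fusion rules are based on the marginal distributions of RX and SEM output. For each
probability of false alarm, $\alpha$, let $T_R(\alpha)$ and $T_S(\alpha)$ be the corresponding thresholds of RX and SEM, respectively.
Then

$$\delta_{AND} = \begin{cases} 0, & \text{if } r < T_R(\alpha) \text{ or } s < T_S(\alpha) \\ 1, & \text{otherwise} \end{cases} \qquad (14)$$

$$\delta_{OR} = \begin{cases} 0, & \text{if } r < T_R(\alpha) \text{ and } s < T_S(\alpha) \\ 1, & \text{otherwise} \end{cases} \qquad (15)$$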
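
The AND and OR rules can be sketched in a few lines of Python. The thresholds are taken here from empirical quantiles of background scores, which is an assumption made for this example (thresholds could equally be read from fitted marginal distributions); the score lists are synthetic.

```python
def quantile_threshold(background_scores, alpha):
    """Threshold exceeded by roughly a fraction alpha of background scores."""
    ordered = sorted(background_scores)
    # Position of the empirical (1 - alpha) quantile, clipped to a valid index.
    idx = min(len(ordered) - 1, max(0, int((1.0 - alpha) * len(ordered))))
    return ordered[idx]

def and_rule(r, s, t_r, t_s):
    """Declare a target (1) only if both statistics reach their thresholds."""
    return 0 if (r < t_r or s < t_s) else 1

def or_rule(r, s, t_r, t_s):
    """Declare a target (1) if either statistic reaches its threshold."""
    return 0 if (r < t_r and s < t_s) else 1

rx_bg = [float(i) for i in range(100)]    # synthetic background RX scores
sem_bg = [0.5 * i for i in range(100)]    # synthetic background SEM scores
t_r = quantile_threshold(rx_bg, 0.05)
t_s = quantile_threshold(sem_bg, 0.05)
```

Both rules produce rectangular decision regions in the RX-SEM plane, so neither can follow the shape of the joint distribution of the two detector outputs.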


The joint-density method utilizes a joint pdf defined on (R,S). Let f(r,s) be a joint RX-SEM density; then f can be
used to define decision regions $R_0$ and $R_1$ as follows. For each $c \in [0,\infty)$ define

$$\delta_c(r,s) = \begin{cases} 0, & \text{if } f(r,s) > c \\ 1, & \text{otherwise} \end{cases} \qquad (16)$$

Note that $\delta_c$ is a likelihood ratio test in case the distribution of RX-SEM under the target present hypothesis is uniform
on a bounded region R such that $(r,s) \in R^c \Rightarrow \delta_c(r,s) = 1$, where $R^c$ denotes the complement of R.

In the present work, the joint densities have been constructed using a transform approach. Let
$X = (x_1, \ldots, x_k) \in \mathbb{R}^k$ be a random vector such that the cumulative distribution function (CDF) of $x_j$ is $\Psi_j$. Let

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{t^2}{2}\right) dt$$

be the zero-mean, unit-variance normal CDF. Then $y_j = \Psi_j(x_j)$ has a
uniform distribution, and $z_j = \Phi^{-1}(\Psi_j(x_j))$ has a zero-mean, unit-variance normal distribution. Let
$C = \mathrm{cov}(z_1, \ldots, z_k)$, and

$$p(z) = \frac{1}{(2\pi)^{k/2} |C|^{1/2}} \exp\left(-\frac{1}{2} z^T C^{-1} z\right). \qquad (17)$$

Define

$$f(x) = p(T(x))\, |\nabla T(x)|, \qquad (18)$$

where $T(x) = (\Phi^{-1}(\Psi_1(x_1)), \ldots, \Phi^{-1}(\Psi_k(x_k)))$ and $|\nabla T(x)|$ is the determinant of the Jacobian of T. Then f is
a probability density function on $\mathbb{R}^k$ such that the distribution of $x_j$ induced by f, the j-th marginal distribution, is
$\Psi_j$, and C, the normal score correlation of f, is an asymptotically unbiased estimator of the normal score correlation
of the underlying distribution [3]. Furthermore, f is the maximum entropy density among pdfs having prescribed
marginals and normal score correlation [3]. f is the pullback of p via T.
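
For two detector outputs, the transform construction and the density-threshold decision rule of Eq. 16 can be sketched with the Python standard library. Because T acts on each coordinate separately, its Jacobian is diagonal with entries equal to the marginal density divided by the standard normal density at the normal score. The exponential marginals, the correlation value, and the function names are assumptions for illustration; the paper uses fitted gamma mixture marginals.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def joint_density(r, s, cdf_r, pdf_r, cdf_s, pdf_s, rho):
    """Density f(r, s) = p(T(r, s)) * |det Jacobian of T| for two statistics.

    rho is the normal score correlation, so C = [[1, rho], [rho, 1]].
    """
    z1 = STD_NORMAL.inv_cdf(cdf_r(r))
    z2 = STD_NORMAL.inv_cdf(cdf_s(s))
    det_c = 1.0 - rho * rho
    # Bivariate normal density with unit variances and correlation rho.
    quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / det_c
    p = math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det_c))
    # Diagonal Jacobian entries: marginal pdf over standard normal pdf.
    jac = (pdf_r(r) / STD_NORMAL.pdf(z1)) * (pdf_s(s) / STD_NORMAL.pdf(z2))
    return p * jac

def decide(r, s, c, density):
    """Return 0 (target free) where the density exceeds c, else 1 (target)."""
    return 0 if density(r, s) > c else 1

# Hypothetical unit-rate exponential marginals, used to keep the example short.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_pdf(x):
    return math.exp(-x)

f_indep = joint_density(1.0, 2.0, exp_cdf, exp_pdf, exp_cdf, exp_pdf, rho=0.0)

def f_corr(r, s):
    return joint_density(r, s, exp_cdf, exp_pdf, exp_cdf, exp_pdf, rho=0.5)
```

With zero correlation the construction reduces to the product of the marginal densities, which gives a simple sanity check; a nonzero correlation tilts the density contours, and therefore the decision boundary, away from the axis-aligned boundaries of the AND and OR rules.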


A modification of this approach can be utilized to obtain a standard set of marginal densities. Let
$(\Lambda_1, \ldots, \Lambda_k)$ be a preferred set of k marginal distributions, and define $w_j = \Lambda_j^{-1}(\Psi_j(x_j))$. Then the
j-th marginal distribution of $w = (w_1, \ldots, w_k)$ is $\Lambda_j$. Define pdf g to be the pullback of p via
$S(w) = (\Phi^{-1}(\Lambda_1(w_1)), \ldots, \Phi^{-1}(\Lambda_k(w_k)))$. For example, Moran [5] uses this construction for k = 2 and $\Lambda_j$ a
gamma distribution. We are evaluating the relative advantages of standardizing the marginal distributions to
exponential form.
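
A minimal sketch of standardizing one marginal to exponential form is given below. It uses the empirical CDF (via ranks) as a stand-in for the fitted marginal CDF, which is an assumption made for this example; the inverse CDF of the unit exponential is $-\log(1 - u)$. The function name and the sample scores are hypothetical.

```python
import math

def to_exponential_marginal(samples):
    """Map samples to approximately unit-exponential marginals via ranks."""
    n = len(samples)
    order = sorted(range(n), key=lambda i: samples[i])
    w = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u = rank / (n + 1.0)          # keeps u strictly inside (0, 1)
        w[i] = -math.log(1.0 - u)     # inverse CDF of the unit exponential
    return w

scores = [3.2, 0.4, 7.9, 1.1, 5.0]    # hypothetical detector outputs
w = to_exponential_marginal(scores)
```

The transform is monotone, so the ordering of the detector outputs is preserved and only the shape of the marginal distribution changes.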


4.  APPLICATIONS 

These  techniques  have  been  applied  to  three  hyperspectral  data  sets  and  the  results  are  displayed  in  Figures 
2-12.  In  these  figures  FR  refers  to  fusion  using  the  joint-density  approach  in  which  separate  densities  are  estimated 
for the sets $B_0$ and $B_1$ defined by $B_0 = \{x \mid b(x) < T_{BHEP}\}$ and $B_1 = \{x \mid b(x) \ge T_{BHEP}\}$. FG refers to fusion using the
joint-density approach in which a global density is fit to (R,S) output. MS refers to fusion using the model selection
approach. The distributions of RX and SEM output were fit to 2-term gamma mixture densities using the
expectation-maximization algorithm. The targets are indicated as red diamonds.
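
ROC comparisons of this kind can be produced from any scalar detection score for which larger values indicate a more target-like pixel (for the joint-density rule, the negative log density would be one such score, which is an assumption about how the rule is scored). The sketch below counts false alarms at the thresholds needed to detect successive targets; the scores are hypothetical.

```python
def roc_points(target_scores, background_scores):
    """Return (false_alarm_count, detection_probability) pairs.

    Each target score is used in turn as the threshold; a pixel is
    declared a target when its score is at least the threshold.
    """
    points = []
    n_targets = len(target_scores)
    for thr in sorted(set(target_scores), reverse=True):
        detected = sum(1 for t in target_scores if t >= thr)
        false_alarms = sum(1 for b in background_scores if b >= thr)
        points.append((false_alarms, detected / n_targets))
    return points

targets = [9.0, 7.5, 3.0]                  # hypothetical scores at target pixels
background = [1.0, 2.0, 2.5, 4.0, 8.0]     # hypothetical scores at background pixels
curve = roc_points(targets, background)
```

Dividing the false alarm counts by the imaged area gives false alarms per unit area, the quantity used to compare algorithms in the conclusions.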

Data  set  one  is  VNIR  hyperspectral  data  collected  over  a  desert.  Figure  2  is  a  scatter  plot  of  spatially 
associated  RX-SEM  output  with  an  overlay  of  the  contours  of  the  probability  density  obtained  using  the  transform 
method  defined  above.  Figure  3  compares  ROC  curves  of  SEM,  RX  and  FR.  The  six  outermost  targets  are  detected 
without  false  alarms  by  SEM  and  FR,  whereas  RX  incurs  a  significant  number  of  false  alarms  to  detect  these  targets. 
Figure  4  compares  the  performance  obtained  using  fusion  rules  FG,  FR,  and  MS.  FR  is  able  to  detect  six  targets 
without false alarms, whereas FG can only detect five without false alarms. Furthermore, FR incurs approximately
an  order  of  magnitude  fewer  false  alarms  than  MS  to  detect  the  seventh  target.  Figure  5  compares  the  performance 
of  fusion  rules  OR,  AND,  and  FR.  OR  and  FR  are  comparable  and  detect  the  outermost  targets  without  false  alarms, 
whereas  AND  incurs  a  significant  number  of  false  alarms  at  threshold  settings  that  detect  these  six  targets. 

Data  set  two,  which  is  VNIR  hyperspectral  data  collected  over  a  forest,  is  similarly  analyzed  in  Figures  6-9. 
The  six  (PD=0.33)  outermost  targets  in  Figure  6  are  detected  with  fewer  false  alarms  using  one  of  the  fusion 
approaches  than  using  either  RX  or  SEM  alone  as  evidenced  in  the  ROC  curves.  At  this  level,  from  Figure  8,  one 
sees  that  FR  has  approximately  half  as  many  false  alarms  as  FG  and  MS.  From  Figure  9,  at  PD=0.3,  one  sees  that 
FR  has  about  an  equal  number  of  false  alarms  as  AND  and  approximately  half  as  many  as  OR.  The  next  three 
targets  are  substantially  further  down  in  the  clutter  in  Figure  6,  and  the  number  of  false  alarms  incurred  to  detect 
them  goes  up  by  approximately  two  orders  of  magnitude  using  any  of  the  fusion  techniques.  From  this  point  to 
detection  of  all  targets  the  fusion  result  is  comparable  to  the  better  of  the  two  algorithms,  SEM,  but  neither  is 
satisfactorily  separating  the  targets  from  the  clutter. 

Data  set  three,  which  is  LWIR  hyperspectral  data  from  a  forest,  is  analyzed  in  Figures  10-12.  The  data  set 
contains  seven  targets,  and  several  of  the  targets  are  represented  by  more  than  one  diamond  in  Figure  10.  We  see, 
from  Figure  11,  significant  fusion  gain,  a  reduction  in  the  number  of  false  alarms  by  1.5  and  2  orders  of  magnitude, 
using  FR  rather  than  RX  or  SEM,  respectively,  at  threshold  levels  sufficient  to  detect  the  four  outermost  targets. 

The  other  three  targets  are  well  inside  the  clutter,  and  the  performance  of  FR  is  comparable  to  RX,  which  in  this  case 
outperforms  SEM.  From  Figure  12  we  see  that  FR  outperforms  OR  and  AND  in  detecting  the  four  outermost  targets 
by  an  order  of  magnitude  in  the  number  of  false  alarms,  and  their  performance  is  comparable  at  higher  probabilities 
of  false  alarm.  The  model  selection  fusion  approach  would  default  to  the  SEM  algorithm  in  this  case  as  all  pixels 
evaluated  failed  the  BHEP  test. 


5.  CONCLUSIONS/FUTURE  DIRECTIONS 

This  study  has  demonstrated  improved  performance  by  following  RX  and  SEM  processing  with  a  fusion 
algorithm. At thresholds such that either RX or SEM has fewer than 10-100 false alarms per km^2, fusion has been
shown to reduce false alarms by 0.25 to 2 orders of magnitude. Furthermore, if thresholds are set so that both
algorithms produce more than approximately 100 false alarms per km^2, the performance of the fusion algorithm was
comparable  to  or  better  than  either  RX  or  SEM  alone. 


This work has also shown that the FR fusion algorithm produces more consistent results than either of the
other  algorithms  considered  in  this  study.  Table  1  shows  the  best  algorithm  from  the  set  of  choices  for  each  data 
set.  If  the  choices  are  SEM  and  RX,  then  an  algorithm  selection  criterion  should  be  developed  to  determine  the 
conditions  under  which  each  algorithm  should  be  used,  as  SEM  is  the  algorithm  of  choice  for  data  sets  one  and  two, 
while  RX  achieves  better  results  for  data  set  3.  The  model  selection  criterion  based  on  the  BHEP  test  worked  quite 
well  for  data  sets  1  and  2,  as  evidenced  in  Figures  4  and  8.  However,  this  test  would  have  selected  SEM  for  all  of 
data  set  3,  and  RX  was  the  better  algorithm  to  use  for  these  data.  Similarly  AND  is  preferred  over  OR  on  data  set  2, 
but  OR  is  preferred  over  AND  on  data  sets  1  and  3.  Thus  if  these  fusion  rules  are  to  be  adopted,  then  a  selection 
criterion  needs  to  be  developed.  When  the  choices  include  FR  it  is  the  preferred  approach,  and  its  performance  is 
very similar to OR on data set 1 and to AND on data set 2. On data set 3, FR and FG are identical as all pixels lie in
the non-Gaussian class. On data sets one and two, however, FR is preferred over FG.


Table 1. The optimal algorithm from the set of choices for each data set.

Algorithm Choices              Data Set 1   Data Set 2   Data Set 3
SEM, RX                        SEM          SEM          RX
SEM, RX, AND, OR               OR           AND          OR
SEM, RX, AND, OR, MS, FR, FG   OR-FR        AND-FR       FR=FG

There  are  many  issues  to  address  in  algorithm  and  sensor  fusion.  We  are  investigating  alternate  means  of 
modeling  the  marginal  distributions  and  constructing  the  joint  density.  We  are  evaluating  the  relative  merits  of  pixel 
level  and  spatially  associated  fusion.  We  are  analyzing  the  computational  efficiency  and  accuracy  of  various 
parameter  estimation  techniques.  We  will  be  incorporating  other  detection  algorithms,  and  we  intend  to  apply  these 
techniques  to  certain  sensor  fusion  problems. 
Figure 1. Two-term gamma mixture density fit to SEM detector output from desert background data having
parameters $p_1 = 0.962$, $p_2 = 0.038$, $v_1 = 15.7$, $v_2 = 2.44$, $a_1 = 0.035$, $a_2 = 0.305$. Axes: tail probability
versus detector threshold.


Figure  2.  Scatter  plot  of  spatially 
associated  RX-SEM  output  from  desert 
data  and  contours  of  minus  the  natural 
logarithm  of  the  joint  density. 
Figure 3. ROC comparison of RX, SEM,
and FR applied to desert VNIR data.


Figure 4. ROC comparison of fusion
methods FR, FG, and MS applied to desert
VNIR data.
Figure 5. ROC comparison of fusion
methods  FR,  AND,  and  OR  applied  to 
desert  VNIR  data. 


Figure 6. Scatter plot of spatially
associated RX-SEM output from VNIR
forest data and contours of minus the
natural logarithm of the joint density.


Figure 7. ROC comparison of FR, RX,
and SEM applied to forest VNIR data.


Figure 8. ROC comparison of fusion methods
FR, MS, and FG applied to forest VNIR data.


Figure  9.  ROC  comparison  of  fusion 
methods  FR,  AND,  and  OR  applied  to 
forest  VNIR  data. 
Figure 10. Scatter plot of spatially associated RX-SEM
output from LWIR forest data and contours of
minus the natural logarithm of the joint density.


Figure  11.  ROC  comparison  of  FR,  RX, 
and  SEM  applied  to  forest  LWIR  data. 


Figure  12.  ROC  comparison  of  fusion  methods 
FR,  AND,  and  OR  applied  to  forest  LWIR  data. 


